
Fix whisper #1037 (Open)

wants to merge 2 commits into master
Conversation

csukuangfj (Collaborator)

Fixes #633

@szaszakgy

Could you use this PR to test the wave that fails to decode?

Please first use the test.py from this PR. You will need to re-export the model with the latest export-onnx.py from this PR.

I will fix the C++ code tomorrow.

CC @GaryLaurenceauAva

@szaszakgy

Hi @csukuangfj, thanks for the feedback!
I am able to run test.py on the original recording. It returns a transcription, but the result is chopped: the last 9 words are missing. Testing on the shared file problem_01.wav returns a perfect result. I also tested 3 more problem recordings. Two of them return transcripts that are chopped compared to original Whisper. One of them still fails with the previous error: 'INVALID_ARGUMENT : Non-zero status code returned while running Expand node. Name:'/Expand' Status Message: invalid expand shape'. This recording contains repetitions (self-corrections or stuttering), but it can be decoded with original Whisper without issues.
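One plausible cause for the chopped transcripts is Whisper's fixed decoding window: Whisper-style pipelines typically pad or trim audio to a 30-second chunk (480,000 samples at 16 kHz), so anything past one window can be silently dropped. The following is a minimal sketch of that pad-or-trim behavior — it is an illustration of the general Whisper convention, not the actual code in this PR.

```python
# Sketch of the fixed-window behavior typical of Whisper-style pipelines
# (illustrative, NOT the actual sherpa-onnx implementation): audio is
# padded or trimmed to one 30-second window before feature extraction.

SAMPLE_RATE = 16000                     # Whisper models expect 16 kHz input
CHUNK_SECONDS = 30                      # one decoding window
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS  # 480,000 samples

def pad_or_trim(samples, length=CHUNK_SAMPLES):
    """Return exactly `length` samples: zero-pad short input, trim long input."""
    if len(samples) >= length:
        return samples[:length]          # long audio: the tail is discarded
    return samples + [0.0] * (length - len(samples))  # short audio: zero-pad

short = [0.1] * (10 * SAMPLE_RATE)       # 10 s of audio -> padded to 30 s
long = [0.1] * (45 * SAMPLE_RATE)        # 45 s of audio -> last 15 s dropped
assert len(pad_or_trim(short)) == CHUNK_SAMPLES
assert len(pad_or_trim(long)) == CHUNK_SAMPLES
```

If the failing recordings are longer than 30 seconds, a truncated tail like the "last 9 words missing" symptom would be consistent with this windowing rather than a decoding bug.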

csukuangfj (Collaborator, Author)

Could you share the problematic wav and tell us which model you are using?

@szaszakgy

szaszakgy commented Jun 21, 2024 via email

thewh1teagle (Contributor)

thewh1teagle commented Aug 9, 2024

Tested the newly exported Whisper models with DirectML and CPU on Windows.

tiny.int8 CPU (success)

python .\scripts\whisper\export-onnx.py --model tiny
python .\scripts\whisper\test.py --encoder .\tiny-encoder.int8.onnx --decoder .\tiny-decoder.int8.onnx --tokens tiny-tokens.txt --language en --task transcribe sherpa-onnx-whisper-medium\test_wavs\0.wav
2024-08-09 18:05:44.3137218 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-08-09 18:05:44.3223491 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-08-09 18:05:44.9778383 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-08-09 18:05:44.9866576 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
After early nightfall the yellow lamps would light up here and there the squalid quarter of the brothels.

tiny.int8 DML (success)

python .\scripts\whisper\test.py --encoder .\tiny-encoder.int8.onnx --decoder .\tiny-decoder.int8.onnx --tokens tiny-tokens.txt --language en --task transcribe sherpa-onnx-whisper-medium\test_wavs\0.wav
2024-08-09 18:24:38.9712836 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-08-09 18:24:38.9799328 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-08-09 18:24:39.4824920 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-08-09 18:24:39.4912264 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
After early nightfall the yellow lamps would light up here and there the squalid quarter of the brothels.

medium.int8 CPU (success)

python .\scripts\whisper\export-onnx.py --model medium
 python .\scripts\whisper\test.py --encoder .\medium-encoder.int8.onnx --decoder .\medium-decoder.int8.onnx --tokens .\medium-tokens.txt --language en --task transcribe sherpa-onnx-whisper-medium\test_wavs\0.wav
After early nightfall the yellow lamps would light up here and there the squalid quarter of the brothels.

medium.int8 DML (failed)

python .\scripts\whisper\test.py --encoder .\medium-encoder.int8.onnx --decoder .\medium-decoder.int8.onnx --tokens .\medium-tokens.txt --language en --task transcribe sherpa-onnx-whisper-medium\test_wavs\0.wav       
2024-08-09 18:22:35.7186952 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-08-09 18:22:35.7283108 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-08-09 18:22:40.3298896 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-08-09 18:22:40.3379720 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2024-08-09 18:22:45.4154322 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running MemcpyToHost node. Name:'Memcpy_token_172' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime_pybind11_state.pyd!00007FF9A58D300E: (caller: 00007FF9A601D211) Exception(3) tid(2f14) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.

Traceback (most recent call last):
  File "D:\sherpa\sherpa-onnx\scripts\whisper\test.py", line 415, in <module>
    main()
  File "D:\sherpa\sherpa-onnx\scripts\whisper\test.py", line 370, in main
    logits, n_layer_self_k_cache, n_layer_self_v_cache = model.run_decoder(
                                                         ^^^^^^^^^^^^^^^^^^
  File "D:\sherpa\sherpa-onnx\scripts\whisper\test.py", line 154, in run_decoder
    logits, out_n_layer_self_k_cache, out_n_layer_self_v_cache = self.decoder.run(
                                                                 ^^^^^^^^^^^^^^^^^
  File "C:\Users\User\.rye\py\[email protected]\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running MemcpyToHost node. Name:'Memcpy_token_172' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2557)\onnxruntime_pybind11_state.pyd!00007FF9A58D300E: (caller: 00007FF9A601D211) Exception(3) tid(2f14) 887A0006 The GPU will not respond to more commands, most likely because of an invalid command passed by the calling application.
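When a DirectML session dies mid-run like this (887A0006, "GPU will not respond to more commands"), a common workaround is to try execution providers in priority order and fall back to CPU when one fails. Below is a self-contained sketch of that fallback pattern; the `fake_factory` is a hypothetical stand-in for something like `ort.InferenceSession(model_path, providers=[provider])`, used here only so the pattern runs without onnxruntime installed.

```python
# Generic execution-provider fallback pattern (a sketch, not sherpa-onnx
# code): try each provider in order, keep the first one that works.

def create_with_fallback(factory, providers):
    """Return (provider, session) for the first provider that succeeds."""
    last_err = None
    for provider in providers:
        try:
            return provider, factory(provider)
        except RuntimeError as err:
            last_err = err               # remember why this provider failed
    raise RuntimeError(f"all providers failed: {last_err}")

def fake_factory(provider):
    # Hypothetical stand-in for ort.InferenceSession: simulate the DML
    # crash observed above so the fallback path is exercised.
    if provider == "DmlExecutionProvider":
        raise RuntimeError("887A0006: The GPU will not respond to more commands")
    return f"session-on-{provider}"

chosen, session = create_with_fallback(
    fake_factory, ["DmlExecutionProvider", "CPUExecutionProvider"]
)
# chosen == "CPUExecutionProvider"
```

This does not fix the underlying DML kernel issue for the medium model, but it keeps an application usable while the GPU path is being debugged.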


Successfully merging this pull request may close these issues:
Whisper onnxruntime exception on Android

3 participants